Adaptive Fault Tolerance for Spacecraft

Authors

  • Myron Hecht
  • Herbert Hecht
Abstract

This paper describes the design and implementation of a software infrastructure for real-time fault tolerance for applications on long-duration deep space missions. The infrastructure supports Adaptive Fault Tolerance (AFT), i.e., the ability to change the recovery strategy based on the mission phase, failure history, available resources, and operating environment. The AFT technology can accommodate both adaptive and fixed recovery strategies. For example, during a phase in which power consumption must be minimized, only one processor would be in operation, so the recovery strategy would be to restart and retry. If, on the other hand, the mission were in a time-critical phase (e.g., orbital insertion or encounter), multiple processors would be running, and the recovery strategy would be to switch from a leader copy to a follower copy of the control software. In a fixed recovery strategy, a specified redundant resource is committed when certain failure conditions occur. The most obvious example of a fixed recovery strategy is switching over to the standby processor in the event of any failure of the active processor.

0-7803-5846-5/00/$10.00 © 2000 IEEE
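The distinction between adaptive and fixed recovery in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' infrastructure: the names `MissionPhase`, `Recovery`, `SystemState`, `adaptive_strategy`, and `fixed_strategy` are hypothetical, and the selection rule uses only the mission phase and the number of running processors, since the abstract does not say how failure history or environment enter the decision.

```python
from dataclasses import dataclass
from enum import Enum, auto


class MissionPhase(Enum):
    """Hypothetical mission phases taken from the abstract's two examples."""
    LOW_POWER = auto()      # power consumption must be minimized
    TIME_CRITICAL = auto()  # e.g. orbital insertion or encounter


class Recovery(Enum):
    """Recovery actions named in the abstract."""
    RESTART_AND_RETRY = auto()
    SWITCH_TO_FOLLOWER = auto()
    SWITCH_TO_STANDBY = auto()


@dataclass
class SystemState:
    """Assumed minimal state; the real system tracks more than this."""
    phase: MissionPhase
    active_processors: int


def fixed_strategy(state: SystemState) -> Recovery:
    """Fixed recovery: always commit the designated standby processor."""
    return Recovery.SWITCH_TO_STANDBY


def adaptive_strategy(state: SystemState) -> Recovery:
    """Adaptive recovery: choose the action from the current state."""
    # A follower copy only exists when more than one processor is running.
    if state.phase is MissionPhase.TIME_CRITICAL and state.active_processors > 1:
        return Recovery.SWITCH_TO_FOLLOWER
    # With a single processor there is nothing to switch to.
    return Recovery.RESTART_AND_RETRY


if __name__ == "__main__":
    cruise = SystemState(MissionPhase.LOW_POWER, active_processors=1)
    insertion = SystemState(MissionPhase.TIME_CRITICAL, active_processors=2)
    print(adaptive_strategy(cruise).name)
    print(adaptive_strategy(insertion).name)
    print(fixed_strategy(cruise).name)
```

The fixed strategy ignores the state entirely, which is what makes it fixed; the adaptive one is a function of the state, so extending it to failure history or environment would mean adding fields to `SystemState` and branches to the rule.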


Similar resources

CAFT: Cost-aware and Fault-tolerant routing algorithm in 2D mesh Network-on-Chip

As chip complexity grows and more components must be integrated on a single chip, network-on-chip has become an important infrastructure for on-chip communication and a good alternative to traditional bus-based approaches. As chip density increases, the possibility of failure in the on-chip network increases, and providing correction and fault tol...


Reliability and Performance Evaluation of Fault-aware Routing Methods for Network-on-Chip Architectures (RESEARCH NOTE)

Nowadays, faults and failures are increasing especially in complex systems such as Network-on-Chip (NoC) based Systems-on-a-Chip due to the increasing susceptibility and decreasing feature sizes. On the other hand, fault-tolerant routing algorithms have an evident effect on tolerating permanent faults and improving the reliability of a Network-on-Chip based system. This paper presents reliabili...


The Chameleon Infrastructure for Adaptive, Software Implemented Fault Tolerance

This paper presents Chameleon, an adaptive software infrastructure for supporting different levels of availability requirements in a heterogeneous networked environment. Chameleon provides dependability through the use of ARMORs—Adaptive, Reconfigurable, and Mobile Objects for Reliability. Three broad classes of ARMORs are defined: Managers, Daemons, and Common ARMORs. Key concepts that support...


Orthogonal Fault Tolerance for Dynamically Adaptive Systems

In dynamic systems that adapt to users’ needs and changing environments, dependability needs cannot be avoided. This paper proposes an orthogonal fault tolerance model as a means to manage and reason about multiple fault tolerance mechanisms that co-exist in dynamically adaptive systems. One of the key challenges associated with dynamically evolving fault tolerance needs is the feature interact...


Replication and Resubmission Based Adaptive Decision for Fault Tolerance in Real Time Cloud Computing: A New Approach

Cloud computing, an adoptable technology, is the outcome of the evolution of on-demand services in large-scale distributed computing. With the rising demands and benefits of cloud computing infrastructure, society can leverage intensive computing services and the scalable, virtualized environment of cloud computing to carry out real-time tasks executed on a remote clou...



Publication date: 2002